Probabilistic Grammars and Automata*
EUGENE S. SANTOS


Department of Mathematics, Youngstown State University, Youngstown, Ohio 


A mathematical formulation of probabilistic grammars, as well as the random 
languages generated by probabilistic grammars, is introduced. Various types of 
probabilistic grammars are considered. The relations between these grammars 
and the corresponding types of probabilistic automata are examined. 


I. INTRODUCTION 


Recently, there have been several attempts (Salomaa, 1969 and Ellis, 1969) 
to formulate the concept of probabilistic grammars and to define the random 
languages generated by probabilistic grammars. Unfortunately, none of these
formulations is broad enough to encompass the conventional deterministic
grammars while still preserving its probabilistic character.

In the present paper, another formulation of probabilistic grammars is 
introduced. This formulation does not suffer from the shortcomings 
mentioned above. Moreover, it reduces to the conventional deterministic 
grammars if all functions involved are deterministic. 

Various types of probabilistic grammars are considered. They are type-0, 
context-sensitive, context-free, weak-regular and regular probabilistic 
grammars. They correspond to the various types of grammars in the con- 
ventional theory. 

The bulk of the paper is devoted to the study of the relations between 
the various types of probabilistic grammars and the corresponding types of 
probabilistic automata. 

In Section III, the concept of asynchronous probabilistic automata (APA) 
is defined. It is shown that every weak-regular random language is generable 
by an APA, and vice versa. Moreover, it is shown that every regular random 
language is generable by a synchronous APA, and vice versa. 

In Sections IV and V, the concepts of probabilistic Turing machines 
(PTM) and probabilistic pushdown automata (PPA) are defined. It is shown
that every bounded type-0 (leftmost-bounded context-free) random language 
is generable by a bounded PTM (PPA), and vice versa. 

It is well known (Matthews, 1964) that every language generated by any 
grammar using only leftmost derivations is context free. A similar result 
for random languages is also given in Section V. 


II. PROBABILISTIC GRAMMARS


In this section, a mathematical formulation of probabilistic grammar is 
introduced and the class of random languages generated by probabilistic 
grammars is defined. 


DEFINITION. Let $X$ and $Y$ be nonempty sets. A random function from $X$
into $Y$ is a function $F$ from $Y \times X$ into $[0, 1]$ such that
$\sum_{y \in Y} F(y \mid x) \le 1$ for all $x \in X$. If $\sum_{y \in Y} F(y \mid x) = 1$
for all $x \in X$, then $F$ is a total random function.


Remark. F(y |x) is the probability that the value of the function at x 
is y. 
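
As a minimal sketch of this definition, a random function with finite support can be stored as a table of probabilities and checked against the two conditions. The dictionary layout and the names `row_sums`, `is_random_function` and `is_total` are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction

def row_sums(F):
    """F maps (y, x) to F(y | x); return the sum over y for each x."""
    sums = {}
    for (y, x), prob in F.items():
        sums[x] = sums.get(x, Fraction(0)) + prob
    return sums

def is_random_function(F, X):
    """Check that every entry lies in [0, 1] and every row sums to at most 1."""
    sums = row_sums(F)
    return all(0 <= p <= 1 for p in F.values()) and all(sums.get(x, 0) <= 1 for x in X)

def is_total(F, X):
    """A total random function has every row summing to exactly 1."""
    sums = row_sums(F)
    return is_random_function(F, X) and all(sums.get(x, 0) == 1 for x in X)

# Hypothetical example: at "a" the value is "0" or "1"; at "b" it may be undefined.
F = {("0", "a"): Fraction(1, 2), ("1", "a"): Fraction(1, 2), ("0", "b"): Fraction(1, 3)}
```

Here `F` is a random function on `{"a", "b"}` but is total only when restricted to `{"a"}`.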

Notation. Let $C$ be a nonempty set. $C^*$ is the free semigroup with
identity $e$ generated by $C$, and $C^+ = C^* - \{e\}$. Moreover, if $\alpha \in C^*$, then
$\lg(\alpha)$ denotes the length of $\alpha$.


DEFINITION. A (total) probabilistic production $p$ over $C$ is a (total)
random function from $C^*$ into $C^*$ such that $p(e \mid e) = 1$ and the set

\[ \hat{p} = \{\sigma \in C^* : p(\tau \mid \sigma) > 0 \text{ for some } \tau \in C^*, \text{ where } \tau \ne \sigma\} \]

is finite. If, in addition, for every $\sigma \in C^*$, the set $\{\tau \in C^* : p(\tau \mid \sigma) > 0\}$ is
finite, then $p$ is bounded. An element of $\hat{p}$ is a genetrix of $p$.


Remark. $p(\tau \mid \sigma)$ is the probability that $\sigma$ will be replaced by $\tau$.


DEFINITION. A (bounded) probabilistic grammar is a quadruple
$G = (T, N, P, h)$, where (i) $T$ and $N$ are disjoint finite nonempty sets;
(ii) $P$ is a finite collection of (bounded) probabilistic productions over $T \cup N$
such that $\sigma \in \hat{P} = \bigcup_{p \in P} \hat{p}$ implies $\sigma \in (T \cup N)^* N (T \cup N)^*$; and (iii) $h$ is
a function from $N$ into $[0, 1]$ such that $\sum_{A \in N} h(A) \le 1$.

In the above definition, $T$ and $N$ are, respectively, the terminals and the
nonterminals, and $h(A)$ is the probability that $A$ is the start symbol of $G$.

In order to define the random language generated by a probabilistic
grammar, a few preliminary concepts are needed and are introduced below. 


Notation. J is the collection of all positive integers. 


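DEFINITION. Let $\alpha, \sigma \in C^*$ and $k \in J$. $m(\alpha, \sigma) = k$ iff (if and only if)
there exist $\mu_i, \nu_i \in C^*$, $i = 1, 2, \ldots, k$, where $\nu_i \ne \nu_j$ for $i \ne j$, such that
(i) $\alpha = \mu_i \sigma \nu_i$ for all $i = 1, 2, \ldots, k$, and (ii) $\alpha = \mu\sigma\nu$ implies $\mu = \mu_i$ and
$\nu = \nu_i$ for some $i$ where $1 \le i \le k$. For completeness' sake, we define
$m(\alpha, \sigma) = 0$ iff $\alpha \ne \mu\sigma\nu$ for all $\mu, \nu \in C^*$.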


Remark. $m(\alpha, \sigma) = k$ iff $\alpha$ can be expressed in the form $\mu\sigma\nu$ in exactly $k$
distinct ways.
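
The occurrence count can be sketched directly from this characterization. The snippet below assumes strings over single-character symbols; the function name `m` mirrors the paper's notation, while the string encoding is an illustrative choice. Overlapping occurrences are counted separately, which is why Python's built-in `str.count` (non-overlapping) is not used.

```python
def m(alpha: str, sigma: str) -> int:
    """Number of ways to write alpha = mu + sigma + nu; overlaps count separately."""
    width = len(sigma)
    return sum(1 for i in range(len(alpha) - width + 1) if alpha[i:i + width] == sigma)
```

For instance, `m("aaa", "aa")` counts the two decompositions with `mu = ""` and `mu = "a"`.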


DEFINITION. Let $G = (T, N, P, h)$ be a probabilistic grammar. A replacement
function of $G$ is a function $\delta$ from $(T \cup N)^*$ into $\hat{P}$ such that $\delta(\alpha) = \sigma$
implies $m(\alpha, \sigma) > 0$. If, in addition, $\delta(\alpha) = \sigma$ implies $\alpha = \mu\sigma\nu$, where $\mu \in T^*$
and $\nu \in (T \cup N)^*$, then $\delta$ is leftmost.


Notation. $D(G)$ is the collection of all replacement functions of $G$,
and $D_L(G)$ is the collection of all leftmost replacement functions of $G$.


Remark. $\delta(\alpha) = \sigma$ iff some occurrence of $\sigma$ in $\alpha$ will be replaced.

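DEFINITION. Let $\alpha, \beta, \sigma, \tau \in C^*$ and $k \in J$. $\alpha \sim^k \beta \operatorname{mod}(\sigma, \tau)$ iff there
exist $\mu, \nu \in C^*$ such that $\alpha = \mu\sigma\nu$, $\beta = \mu\tau\nu$ and $m(\mu\sigma, \sigma) = k$.

Remark. $\alpha \sim^k \beta \operatorname{mod}(\sigma, \tau)$ iff $\beta$ can be obtained from $\alpha$ by replacing
the $k$-th occurrence of $\sigma$ in $\alpha$ by $\tau$.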
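
The relation can be illustrated by a small helper that performs the replacement. This is a sketch under the assumption that strings are Python `str` values over single-character symbols; `replace_kth` is a hypothetical name. Occurrences are numbered left to right by starting position, which matches the condition that the prefix ending with the chosen occurrence contains exactly $k$ occurrences.

```python
def replace_kth(alpha, sigma, tau, k):
    """Return beta obtained by replacing the k-th occurrence of sigma in alpha by tau.

    Returns None when alpha has fewer than k occurrences of sigma.
    """
    seen = 0
    for i in range(len(alpha) - len(sigma) + 1):
        if alpha[i:i + len(sigma)] == sigma:
            seen += 1              # occurrences starting at or before position i
            if seen == k:
                return alpha[:i] + tau + alpha[i + len(sigma):]
    return None
```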


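DEFINITION. Let $G = (T, N, P, h)$ be a probabilistic grammar and
$C = T \cup N$.

(1) For every $p \in P$, $\delta \in D(G)$ and $k \in J$, we associate the function
$F_{p,\delta,k}$ from $C^* \times C^*$ into $[0, 1]$, where

\[ F_{p,\delta,k}(\beta \mid \alpha) = \begin{cases} p(\tau \mid \delta(\alpha)) & \text{if } \alpha \sim^k \beta \operatorname{mod}(\delta(\alpha), \tau) \\ 0 & \text{otherwise.} \end{cases} \]

(2) For every $\zeta \in (P \times D(G) \times J)^*$, we associate the function $f_\zeta$ from
$C^* \times C^*$ into $[0, 1]$ defined inductively as follows:

\[ f_e(\beta \mid \alpha) = \begin{cases} 1 & \text{if } \alpha = \beta \\ 0 & \text{if } \alpha \ne \beta \end{cases} \]

and

\[ f_{\zeta(p,\delta,k)}(\beta \mid \alpha) = \sum_{\gamma \in C^*} f_\zeta(\gamma \mid \alpha)\, F_{p,\delta,k}(\beta \mid \gamma). \]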
yeC* 


Remark. It is clear that $f_\zeta$ is a random function from $(T \cup N)^*$ into
$(T \cup N)^*$. Moreover, $f_\zeta(\beta \mid \alpha)$ is the probability that $\beta$ can be obtained
from $\alpha$ by the "derivation" $\zeta$.
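
The inductive definition of the derivation probability can be sketched as a forward pass over a distribution of sentential forms. Everything about the encoding is an assumption made for illustration: a production is a dictionary keyed by `(tau, sigma)`, a replacement function is a Python callable returning the genetrix to rewrite (or `None` when none occurs, in which case the mass is dropped), and a derivation is a list of triples.

```python
from fractions import Fraction

def replace_kth(alpha, sigma, tau, k):
    """Replace the k-th (possibly overlapping) occurrence of sigma in alpha by tau."""
    seen = 0
    for i in range(len(alpha) - len(sigma) + 1):
        if alpha[i:i + len(sigma)] == sigma:
            seen += 1
            if seen == k:
                return alpha[:i] + tau + alpha[i + len(sigma):]
    return None  # fewer than k occurrences

def apply_step(dist, p, delta, k):
    """Push a distribution over strings through one triple (p, delta, k)."""
    out = {}
    for gamma, mass in dist.items():
        sigma = delta(gamma)
        if sigma is None:          # assumption: no genetrix occurs, mass is dropped
            continue
        for (tau, s), prob in p.items():
            if s == sigma:
                beta = replace_kth(gamma, sigma, tau, k)
                if beta is not None:
                    out[beta] = out.get(beta, Fraction(0)) + mass * prob
    return out

def f_zeta(zeta, alpha):
    """Distribution over strings reachable from alpha by the derivation zeta."""
    dist = {alpha: Fraction(1)}
    for p, delta, k in zeta:
        dist = apply_step(dist, p, delta, k)
    return dist

# Hypothetical grammar fragment: S -> aS with probability 1/2, S -> a with probability 1/2.
p = {("aS", "S"): Fraction(1, 2), ("a", "S"): Fraction(1, 2)}
delta = lambda gamma: "S" if "S" in gamma else None
zeta = [(p, delta, 1)] * 3
```

With this three-step derivation, the string `"aaa"` is reached from `"S"` with probability 1/8, since each step must pick the intended alternative.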


DEFINITION. A random language $f$ over $C$ is a function from $C^*$ into
$[0, 1]$.

Remark. $f(\alpha)$ is the probability that $\alpha$ is a member of the language.


DEFINITION. A random control set $\chi$ of a probabilistic grammar
$G = (T, N, P, h)$ is a random language over $P \times D(G) \times J$.


Remark. $\chi(\zeta)$ is the probability that the derivation $\zeta$ will be applied.
This concept of control set is similar to but distinct from that introduced by 
Ginsburg and Spanier (1968). 


DEFINITION. Let $G = (T, N, P, h)$ be a probabilistic grammar and $\chi$
a random control set of $G$. The random language $f_{G,\chi}$ generated by $G$ under $\chi$
is the random language over $T$, where

\[ f_{G,\chi}(\alpha) = \operatorname*{l.u.b.}_{\zeta \in Z^*} \; \chi(\zeta) \sum_{A \in N} h(A)\, f_\zeta(\alpha \mid A). \]

Here, $Z = P \times D(G) \times J$ and l.u.b. stands for least upper bound. If
$\chi(\zeta) = 1$ for all $\zeta \in Z^*$, then we shall write $f_G$ for $f_{G,\chi}$. In this case, we say
that $f_G$ is the random language generated by $G$. Moreover, if

\[ \chi(\zeta) = \begin{cases} 1 & \text{if } \zeta \in (P \times D_L(G) \times \{1\})^* \\ 0 & \text{otherwise,} \end{cases} \]

then we write $f_{G,L}$ for $f_{G,\chi}$. In this case, we say that $f_{G,L}$ is the random language
generated by $G$ with leftmost derivations only.


Remark. $f_{G,\chi}(\alpha)$ is the probability that $\alpha$ will be generated by $G$ under $\chi$.

The probabilistic grammars defined above will be referred to as type-0 
probabilistic grammars. Other more restrictive types of probabilistic gram- 
mars can be obtained by imposing certain restrictions on the probabilistic 
productions. Four such probabilistic grammars are defined below, three of 
which correspond to the conventional context-sensitive, context-free and 
regular grammars (Hopcroft and Ullman, 1969). 

DEFINITION. Let $G = (T, N, P, h)$ be a probabilistic grammar.

(1) $G$ is context sensitive iff for every $p \in P$, $p(\tau \mid \sigma) > 0$ implies
$\lg(\sigma) \le \lg(\tau)$.

(2) $G$ is context free iff $\hat{P} \subseteq N$.

(3) $G$ is weakly regular iff for every $p \in P$, $\hat{p} \subseteq N$ and for every $\sigma \in \hat{p}$,
$p(\tau \mid \sigma) > 0$ implies $\tau = \mu A$, where $\mu \in T^*$ and $A \in N \cup \{e\}$.

(4) $G$ is regular iff for every $p \in P$, $\hat{p} = N$ and for every $\sigma \in \hat{p}$,
$p(\tau \mid \sigma) > 0$ implies $\tau = aA$, where $a \in T$ and $A \in N \cup \{e\}$.
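
The four restrictions can be checked mechanically on a finite table of productions. The sketch below is an illustration only: productions are dictionaries keyed by `(tau, sigma)`, symbols are single characters, and condition (4) is read as requiring the genetrix set of each production to equal the whole nonterminal set. All function names are hypothetical.

```python
from fractions import Fraction

def support(p):
    """Pairs (tau, sigma) with p(tau | sigma) > 0."""
    return [(tau, sigma) for (tau, sigma), prob in p.items() if prob > 0]

def genetrices(p):
    """Left-hand sides that can be rewritten into something different."""
    return {sigma for tau, sigma in support(p) if tau != sigma}

def is_context_sensitive(P):
    return all(len(sigma) <= len(tau) for p in P for tau, sigma in support(p))

def is_context_free(P, N):
    return all(genetrices(p) <= N for p in P)

def is_weakly_regular(P, T, N):
    def right_linear(tau):  # tau = mu A with mu in T*, A in N or empty
        body = tau[:-1] if tau and tau[-1] in N else tau
        return all(c in T for c in body)
    return is_context_free(P, N) and all(
        right_linear(tau) for p in P for tau, sigma in support(p) if sigma in genetrices(p))

def is_regular(P, T, N):
    def one_step(tau):      # tau = a A with a in T, A in N or empty
        return len(tau) in (1, 2) and tau[0] in T and (len(tau) == 1 or tau[1] in N)
    return all(genetrices(p) == N for p in P) and all(
        one_step(tau) for p in P for tau, sigma in support(p) if sigma in genetrices(p))

T, N = {"a"}, {"S"}
p_reg = {("aS", "S"): Fraction(1, 2), ("a", "S"): Fraction(1, 2)}
p_cf = {("SS", "S"): Fraction(1, 2), ("", "S"): Fraction(1, 2)}
```

The production `p_reg` passes all four checks, while `p_cf` is context free but neither weakly regular nor context sensitive (it erases `S`).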


It is apparent that every language generated by a conventional type-0,
context-sensitive, context-free or regular grammar in the conventional
manner may be associated with $f_G$ for some type-0, context-sensitive,
context-free or regular probabilistic grammar $G = (T, N, P, h)$, where $h$, as well
as all $p \in P$, are deterministic, and vice versa.

It follows from the above definition that every regular probabilistic 
grammar is weakly regular, and every weakly regular probabilistic grammar 
is context free. Moreover, if $G$ is a context-free probabilistic grammar,
then $D_L(G)$ contains exactly one replacement function of $G$.

In what follows, if $f = f_G$ for some specified (type-0, context-sensitive,
context-free, weak-regular and regular) probabilistic grammar $G$, then we
shall say that $f$ is that specific random language, and vice versa.


THEOREM 2.1. If $f$ is a (bounded) (type-0, context-sensitive, context-free,
weakly regular, regular) random language over $T$, then $f = f_G$ for some (bounded)
(type-0, context-sensitive, context-free, weakly regular, regular) probabilistic
grammar $G = (T, N, P, h)$, where (i) every $p \in P$ is a total probabilistic
production, (ii) $h(A_0) = 1$ for some $A_0 \in N$, and (iii) $N \subseteq \hat{P}$.


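Proof. Let $f = f_{G_0}$, where $G_0 = (T, N_0, P_0, h_0)$ is a (bounded) (type-0,
context-sensitive, context-free, weakly regular, regular) probabilistic grammar.
Let $A_0, A_1 \notin T \cup N_0$, and let $a_0$ be an arbitrarily fixed element of $T$. Let
$N = N_0 \cup \{A_0, A_1\}$ and $N_1 = N - \hat{P}_0$. For every $p \in P_0$, we associate the
probabilistic productions $p'$ and $p''$ over $T \cup N$ such that for every
$\sigma, \tau \in (T \cup N)^*$,

\[ p'(\tau \mid \sigma) = \begin{cases}
p(\tau \mid \sigma) & \text{if } \sigma \in \hat{P}_0,\ \tau \in (T \cup N_0)^* \\
1 - \sum_{\beta \in (T \cup N_0)^*} p(\beta \mid \sigma) & \text{if } \sigma \in \hat{P}_0,\ \tau = a_0 A_1 \\
1 & \text{if } \sigma = \tau \notin \hat{P}_0 \cup N_1 \\
  & \text{or } \sigma \in N_1,\ \tau = a_0 A_1 \\
0 & \text{otherwise,}
\end{cases} \]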


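and

\[ p''(\tau \mid \sigma) = \begin{cases}
\sum_{A \in N_0} h_0(A)\, p(\tau \mid A) & \text{if } \sigma = A_0,\ \tau \in (T \cup N_0)^* \\
1 - \sum_{\beta \in (T \cup N_0)^*} \sum_{A \in N_0} h_0(A)\, p(\beta \mid A) & \text{if } \sigma = A_0,\ \tau = a_0 A_1 \\
1 & \text{if } \sigma = \tau \ne A_0 \\
0 & \text{otherwise.}
\end{cases} \]

Define $G = (T, N, P, h)$, where $P$ is the collection of all $p'$ and $p''$ defined
above and

\[ h(A) = \begin{cases} 1 & \text{if } A = A_0 \\ 0 & \text{if } A \ne A_0. \end{cases} \]

It can be verified that $G$ has the desired properties.

We shall also write $G = (T, N, P, A_0)$ if it satisfies condition (ii) above.
Moreover, we shall say that $G$ is total if it satisfies conditions (i)-(iii) above.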
We shall conclude this section by introducing the concepts of random 
domain and ranges of random functions, which are needed in subsequent 
discussions. 


DEFINITION. Let F be a random function from X into Y. 


(1) The random domain $D(F)$ of $F$ is the function from $X$ into $[0, 1]$
such that $D(F)(x) = \sum_{y \in Y} F(y \mid x)$ for all $x \in X$.


(2) The random range $R(F)$ of $F$ is the function from $Y$ into $[0, 1]$
such that $R(F)(y) = \operatorname*{l.u.b.}_{x \in X} F(y \mid x)$ for all $y \in Y$.
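
For a random function with finite support, both quantities reduce to a sum and a maximum over a table. The following sketch assumes the same hypothetical dictionary encoding keyed by `(y, x)`; since all values are nonnegative, the least upper bound over finitely many listed entries is simply their maximum.

```python
from fractions import Fraction

def random_domain(F):
    """D(F)(x): total probability that F is defined at x."""
    dom = {}
    for (y, x), prob in F.items():
        dom[x] = dom.get(x, Fraction(0)) + prob
    return dom

def random_range(F):
    """R(F)(y): the largest probability with which y is produced from any x."""
    rng = {}
    for (y, x), prob in F.items():
        rng[y] = max(rng.get(y, Fraction(0)), prob)
    return rng

# Hypothetical table of F(y | x) values.
F = {("0", "a"): Fraction(1, 2), ("1", "a"): Fraction(1, 2), ("0", "b"): Fraction(1, 3)}
```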


III. WEAK-REGULAR PROBABILISTIC GRAMMARS AND ASYNCHRONOUS
PROBABILISTIC AUTOMATA


In this section, we shall study weak-regular probabilistic grammars and 
their relation with asynchronous probabilistic automata. 


PROPOSITION 3.1. If $G$ is a weak-regular probabilistic grammar, then

\[ f_G = f_{G,L}. \]


PROPOSITION 3.2. If $f$ is a bounded weak-regular random language, then
$f = f_G$ for some total weak-regular probabilistic grammar $G = (T, N, P, h)$
such that for every $p \in P$ and $A \in N$, $p(\tau \mid A) > 0$ implies $\tau = aB$, where
$a \in T \cup \{e\}$ and $B \in N \cup \{e\}$.

DEFINITION. An asynchronous probabilistic automaton (APA) is specified
by a sextuple $M = (U, S, V, p, h, g)$, where $U$, $S$ and $V$ are finite nonempty
sets, $p$ is a random function from $U \times S$ into $S \times V^*$, and $h$ and $g$ are
functions from $S$ into $[0, 1]$ such that $\sum_{s \in S} h(s) \le 1$.

In the above definition, $U$, $S$ and $V$ are, respectively, the input, state
and output sets. $p(s', y \mid u, s)$ is the conditional probability that the next
state of $M$ is $s'$ and output string $y$ is produced given that the present state
of $M$ is $s$ and input symbol $u$ is applied. $h(s)$ and $g(s)$ are, respectively, the
probabilities that $s$ is the initial state of $M$ and $s$ is a final state of $M$.


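DEFINITION. Let $M = (U, S, V, p, h, g)$ be an APA.

(1) $p^M$ is the function from $S \times V^* \times U^* \times S$ into $[0, 1]$ such that
for every $s', s'' \in S$, $u \in U$, $x \in U^*$ and $y \in V^*$,

\[ p^M(s'', e \mid e, s') = \begin{cases} 1 & \text{if } s' = s'' \\ 0 & \text{if } s' \ne s'' \end{cases} \]

and

\[ p^M(s'', y \mid ux, s') = \sum p(s, y_1 \mid u, s')\, p^M(s'', y_2 \mid x, s), \]

where the summation ranges over all $s \in S$ and $y_1, y_2 \in V^*$ such that
$y = y_1 y_2$.

(2) $F^M$ is the function from $V^* \times U^*$ into $[0, 1]$ such that for every
$x \in U^*$ and $y \in V^*$,

\[ F^M(y \mid x) = \sum_{s', s'' \in S} h(s')\, p^M(s'', y \mid x, s')\, g(s''). \]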


Remark. $p^M(s'', y \mid x, s')$ is the conditional probability that $M$ will be in
state $s''$ and produce output string $y$ given that the present state of $M$ is $s'$
and input string $x$ is applied. $F^M(y \mid x)$ is the probability that $M$ will produce
$y$ when $x$ is applied.
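
The recursion for an APA can be sketched as a forward pass that tracks a distribution over pairs (current state, output so far). The encoding below is an illustrative assumption: transitions are stored in a dictionary keyed by `(next_state, output, input_symbol, state)`, and `run` and `F_M` are hypothetical names. Propagating forward one input symbol at a time gives the same value as the recursive definition, which peels symbols off the front of the input.

```python
from fractions import Fraction

def run(p, h, x):
    """Distribution over (state, output) after reading x from the initial distribution h."""
    dist = {(s, ""): prob for s, prob in h.items()}
    for u in x:
        nxt = {}
        for (s, out), mass in dist.items():
            for (s2, y, u2, s1), prob in p.items():
                if u2 == u and s1 == s:
                    key = (s2, out + y)
                    nxt[key] = nxt.get(key, Fraction(0)) + mass * prob
        dist = nxt
    return dist

def F_M(p, h, g, y, x):
    """Probability that output y is produced on input x, weighted by final-state values g."""
    return sum(mass * g.get(s, 0) for (s, out), mass in run(p, h, x).items() if out == y)

# Hypothetical APA: one input symbol "u", states "q" and "r", outputs over {"a", "b"}.
p = {
    ("q", "a", "u", "q"): Fraction(1, 2),  # stay in q, emit "a"
    ("r", "", "u", "q"): Fraction(1, 2),   # move to r, emit the empty string
    ("r", "b", "u", "r"): Fraction(1),     # stay in r, emit "b"
}
h = {"q": Fraction(1)}
g = {"r": Fraction(1)}
```

On input `"uu"` this automaton yields `"a"` with probability 1/4 and `"b"` with probability 1/2; the output `"aa"` ends in the non-final state `q` and so contributes 0.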

The above model of APA is a generalization of the model introduced by 
Starke and Thiele (1970). 

It is clear that $F^M$ is a random function from $U^*$ into $V^*$, and $D(F^M)$ and
$R(F^M)$ are random languages over $U$ and $V$, respectively. Using the terminology
introduced by Scott (1967), the class of all $F^M$ is the class of all random
functions computable by APA, the class of all $D(F^M)$ is the class of all random
languages acceptable by APA, and the class of all $R(F^M)$ is the class of all
random languages generable by APA.

DEFINITION. Let $M = (U, S, V, p, h, g)$ be an APA.


(1) $M$ is bounded iff for every $s, s' \in S$ and $u \in U$, $p(s', y \mid u, s) = 0$
except for finitely many $y \in V^*$.


(2) $M$ is synchronous iff for every $s, s' \in S$ and $u \in U$, $p(s', y \mid u, s) > 0$
implies $y \in V$.


A synchronous total APA reduces to a stochastic sequential machine if 
g(s) = 1 for all se S. Moreover, it can be verified that a random language 
is acceptable by a synchronous APA iff it is realizable by a conventional 
probabilistic automaton (Santos, 1972). 


THEOREM 3.3. The following statements are equivalent:


(1) $f$ is a weak-regular random language;
(2) $f = R(F^M)$ for some APA $M = (U, S, V, p, h, g)$, where
(i) $p$ is a total random function from $U \times S$ into $S \times V^*$,
(ii) $h(s_0) = 1$ for some $s_0 \in S$, and
(iii) there exists $s_1 \in S$ such that $g(s_1) = 1$ and $g(s) = 0$ for all
$s \ne s_1$; and
(3) $f = R(F^M)$ for some APA $M$.


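Proof. Suppose $f$ is a weak-regular random language. Then, by Theorem
2.1, $f = f_G$ for some total weak-regular probabilistic grammar $G = (T, N, P, A_0)$.

Let $A_1, A_2 \notin T \cup N$ and $N_0 = N \cup \{A_1, A_2\}$. Define $M = (P, N_0, T, p, h, g)$,
where for every $p \in P$, $\alpha \in T^*$ and $A, A' \in N_0$,

\[ p(A', \alpha \mid p, A) = \begin{cases}
p(\alpha A' \mid A) & \text{if } A, A' \in N \\
p(\alpha \mid A) & \text{if } A \in N,\ A' = A_1 \\
1 & \text{if } A' = A_2 \text{ and } A \in \{A_1, A_2\} \\
0 & \text{otherwise,}
\end{cases} \]

\[ h(A) = \begin{cases} 1 & \text{if } A = A_0 \\ 0 & \text{if } A \ne A_0 \end{cases} \]

and

\[ g(A) = \begin{cases} 1 & \text{if } A = A_1 \\ 0 & \text{if } A \ne A_1. \end{cases} \]

It can be verified that $M$ has the desired properties. Thus (1) implies (2).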
(2) implies (3) is trivial. Now suppose that $f = R(F^M)$ for some APA
$M = (U, S, V, p, h, g)$. Without loss of generality, we may assume that
$S \cap V = \emptyset$ (empty set). For every $u \in U$, we associate the probabilistic
production $p_u$ over $V \cup S$, where $\hat{p}_u = S$ and

\[ p_u(\tau \mid s) = \begin{cases} p(s', y \mid u, s) & \text{if } \tau = y s' \\ 0 & \text{otherwise.} \end{cases} \]

Moreover, let $p_0$ be the probabilistic production over $V \cup S$, where $\hat{p}_0 = S$
and

\[ p_0(\tau \mid s) = \begin{cases} g(s) & \text{if } \tau = e \\ 0 & \text{otherwise.} \end{cases} \]

Define $G = (V, S, P, h)$, where $P = \{p_0\} \cup \{p_u : u \in U\}$. It can be verified
that $f = f_G$. Thus (3) implies (1).
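
The passage from an automaton to a grammar in the last part of this proof can be sketched as a table transformation: each transition that emits a string and moves to a new state becomes a rewriting of the old state into that string followed by the new state, and the final-state function becomes an erasing production. The dictionary encodings and the names `apa_to_grammar`, `"p0"` and `"p_" + u` are illustrative assumptions; states are single characters disjoint from the output alphabet.

```python
from fractions import Fraction

def apa_to_grammar(U, p, g):
    """Build productions {name: {(tau, s): prob}} from APA transitions p and final values g."""
    P = {"p0": {("", s): prob for s, prob in g.items() if prob > 0}}
    for u in U:
        prod = {}
        for (s2, y, u2, s), prob in p.items():
            if u2 == u and prob > 0:
                prod[(y + s2, s)] = prob   # state s rewrites to output y followed by state s2
        P["p_" + u] = prod
    return P

# Hypothetical APA transitions keyed by (next_state, output, input_symbol, state).
p = {
    ("q", "a", "u", "q"): Fraction(1, 2),
    ("r", "", "u", "q"): Fraction(1, 2),
    ("r", "b", "u", "r"): Fraction(1),
}
g = {"r": Fraction(1)}
P = apa_to_grammar(["u"], p, g)
```

Because states and output symbols are disjoint and the right-hand side ends in exactly one state symbol, distinct transitions from the same state never collide on the same right-hand side.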


THEOREM 3.4. The following statements are equivalent: 


(1) $f$ is a bounded weak-regular random language;
(2) $f = R(F^M)$ for some APA $M = (U, S, V, p, h, g)$ satisfying
conditions (i)-(iii) of Theorem 3.3 (2) and
(iv) for every $s, s' \in S$ and $u \in U$, $p(s', y \mid u, s) > 0$ implies $y \in V \cup \{e\}$;
and
(3) $f = R(F^M)$ for some bounded APA $M$.


Proof. The proof follows from Proposition 3.2 and the proof of 
Theorem 3.3. 


THEOREM 3.5. (a) If $f$ is a regular random language, then $f = R(F^M)$ for
some synchronous APA $M$ satisfying conditions (i)-(iii) of Theorem 3.3.
(b) If $f = R(F^M)$ for some synchronous APA $M$, then there exists a
regular probabilistic grammar $G$ such that $f(\alpha) = f_G(\alpha)$ for all $\alpha \ne e$.


Proof. (a) follows from the proof of Theorem 3.3. Now, suppose
$f = R(F^M)$, where $M = (U, S, V, p, h, g)$ is a synchronous APA. Without
loss of generality, assume $S \cap V = \emptyset$. For every $u \in U$, we associate the
probabilistic productions $p_u^1$ and $p_u^2$, where $\hat{p}_u^1 = \hat{p}_u^2 = S$ and

\[ p_u^1(\tau \mid s) = \begin{cases} p(s', v \mid u, s) & \text{if } \tau = v s' \\ 0 & \text{otherwise,} \end{cases} \]

\[ p_u^2(\tau \mid s) = \begin{cases} \sum_{s' \in S} p(s', v \mid u, s)\, g(s') & \text{if } \tau = v \\ 0 & \text{otherwise,} \end{cases} \]

for all $s \in S$, $v \in V$ and $\tau \in (V \cup S)^*$. Let $G = (V, S, P, h)$, where
$P = \{p_u^1 : u \in U\} \cup \{p_u^2 : u \in U\}$. It can be verified that $f(\alpha) = f_G(\alpha)$ for all
$\alpha \in V^+$. This completes the proof of (b).

The above corollary states that every regular random language is generable 
by a synchronous APA, and vice versa. For deterministic languages, it is 
well known that every regular language is also acceptable by a finite automa- 
ton, and vice versa. Unfortunately, this is in general not valid for random 
languages. However, a sufficient condition is given below. 


THEOREM 3.6. If $f = f_G$ for some regular probabilistic grammar
$G = (T, N, P, h)$, where $P$ contains exactly one element, then $f$ is acceptable
by some synchronous APA.


Proof. It follows from the proof of Theorem 3.1 that $f = R(F^M)$ for
some synchronous APA $M = (U, S, T, p, h, g)$, where $U = \{u_0\}$. Let
$M' = (T, S, \{u_0\}, p', h, g)$, where for every $s, s' \in S$ and $v \in T$,

\[ p'(s', u_0 \mid v, s) = p(s', v \mid u_0, s). \]

It is clear that $M'$ is a synchronous APA. Moreover, it can be verified that
$f = D(F^{M'})$.


IV. TYPE-0 PROBABILISTIC GRAMMARS AND PROBABILISTIC
TURING MACHINES


In this section, we shall study type-0 probabilistic grammars and their 
relation with probabilistic Turing machines. 


DEFINITION. A probabilistic Turing machine (PTM) is specified by a
sextuple $M = (U, S, V, W, p, h)$, where $U$, $S$, $V$ and $W$ are finite nonempty
sets, $U \cup V \subseteq W$, $S \cap W = \emptyset$, $p$ is a random function from $W \times S$ into
$S \times (W^* \cup \{+, -\})$, $+, - \notin W$, and $h$ is a function from $S$ into $[0, 1]$
such that $\sum_{s \in S} h(s) \le 1$.

In the above definition, $U$, $V$ and $W$ are, respectively, the sets of input,
output and tape symbols. $S$ is the state set. $h(s)$ is the probability that $s$ is
the initial state. $p(s', z \mid w, s)$ is the conditional probability of the "next act"
of the PTM given that its present state is $s$ and the tape symbol $w$ is scanned.
Like the conventional Turing machines, the "next act" of the PTM may be
one of the following:

(1) $z \in W^*$: replace $w$ by $z$ and go to state $s'$.

(2) $z = +$: move one square to the right and go to state $s'$.

(3) $z = -$: move one square to the left and go to state $s'$.


In what follows, if $M = (U, S, V, W, p, h)$ is a PTM, then we shall
assume that $b \in W$ and $b \notin U \cup V$. The symbol $b$ stands for blank.


DEFINITION. Let $M = (U, S, V, W, p, h)$ be a PTM and $\alpha \in (W \cup S)^*$.
$\alpha$ is an instantaneous description of $M$ iff (i) $\alpha$ contains exactly one $s \in S$
and $s$ is not the rightmost symbol of $\alpha$, (ii) the leftmost symbol of $\alpha$ is not $b$,
and (iii) the rightmost symbol of $\alpha$ is not $b$ unless it is the symbol
immediately to the right of $s$.
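
The three conditions translate into a short membership test. The sketch below assumes that symbols are single characters, that the blank is the character `"b"`, and that the caller has already ensured the string lies over the tape and state alphabets; the function name is hypothetical.

```python
def is_instantaneous_description(alpha, S, blank="b"):
    """Check conditions (i)-(iii) for a string alpha, given the state set S."""
    positions = [i for i, c in enumerate(alpha) if c in S]
    # (i) exactly one state symbol, and it is not the rightmost symbol
    if len(positions) != 1 or positions[0] == len(alpha) - 1:
        return False
    i = positions[0]
    # (ii) the leftmost symbol is not blank
    if alpha[0] == blank:
        return False
    # (iii) a trailing blank is allowed only immediately to the right of the state
    if alpha[-1] == blank and len(alpha) - 1 != i + 1:
        return False
    return True
```

For example, with `S = {"q"}` the string `"aqb"` is accepted (the trailing blank is the scanned symbol), while `"aqab"` is rejected because its trailing blank is not adjacent to the state.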


Notation. I(M) is the collection of all instantaneous descriptions of M. 


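DEFINITION. Let $M = (U, S, V, W, p, h)$ be a PTM.

(1) $p^M$ is the function from $I(M) \times I(M)$ into $[0, 1]$ such that for
every $\alpha, \beta \in I(M)$,

\[ p^M(\beta \mid \alpha) = \begin{cases}
p(s', z \mid w, s) & \text{if } \alpha = \zeta s w \delta,\ \beta = \zeta s' z \delta,\ z\delta \ne e \\
 & \text{or } \alpha = \zeta s w,\ \beta = \zeta s' b,\ z = e \\
p(s', + \mid w, s) & \text{if } \alpha = \zeta s w w' \delta,\ \beta = \zeta w s' w' \delta,\ \zeta w \ne b \\
 & \text{or } \alpha = s w w' \delta,\ \beta = s' w' \delta,\ w = b \\
 & \text{or } \alpha = \zeta s w,\ \beta = \zeta w s' b,\ \zeta w \ne b \\
 & \text{or } \alpha = s w,\ \beta = s' b,\ w = b \\
p(s', - \mid w, s) & \text{if } \alpha = \zeta w' s w \delta,\ \beta = \zeta s' w' w \delta,\ w\delta \ne b \\
 & \text{or } \alpha = \zeta w' s w,\ \beta = \zeta s' w',\ w = b \\
 & \text{or } \alpha = s w \delta,\ \beta = s' b w \delta,\ w\delta \ne b \\
 & \text{or } \alpha = s w,\ \beta = s' b,\ w = b \\
0 & \text{otherwise,}
\end{cases} \]

where $\zeta, \delta \in W^*$, $s, s' \in S$, $w, w' \in W$ and $z \in W^*$.

(2) For every $k \in J \cup \{0\}$, $p_k^M$ is the function from $I(M) \times I(M)$
into $[0, 1]$ such that for every $\alpha, \beta \in I(M)$,

\[ p_0^M(\beta \mid \alpha) = \begin{cases} 1 & \text{if } \alpha = \beta \\ 0 & \text{if } \alpha \ne \beta, \end{cases} \]

\[ p_{k+1}^M(\beta \mid \alpha) = \sum_{\gamma \in I(M)} p_k^M(\beta \mid \gamma)\, p^M(\gamma \mid \alpha). \]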

(3) For every $k \in J$, $q_k^M$ is the function from $I(M) \times I(M)$ into $[0, 1]$
such that for every $\alpha, \beta \in I(M)$,

\[ q_k^M(\beta \mid \alpha) = p_k^M(\beta \mid \alpha) \Bigl[ 1 - \sum_{s' \in S} \sum_{z \in Z} p(s', z \mid w, s) \Bigr], \]

where $Z = W^* \cup \{+, -\}$, $s$ is the state symbol contained in $\beta$ and $w$ is the
symbol contained in $\beta$ which is immediately to the right of $s$.

(4) $F^M$ is the function from $V^* \times U^*$ into $[0, 1]$ such that for every
$x \in U^*$ and $y \in V^*$,

\[ F^M(y \mid x) = \sum_{\langle \alpha \rangle = y} \sum_{s \in S} h(s) \Bigl[ \sum_{k \in J} q_k^M(\alpha \mid s x) \Bigr], \]

where $\langle \alpha \rangle$ is the output string obtained from $\alpha$ by striking out all symbols
in $\alpha$ not belonging to $V$.


Remark. (1) $p^M(\beta \mid \alpha)$ is the conditional probability that the "next"
instantaneous description is $\beta$ given that $M$ "starts" with instantaneous
description $\alpha$.


(2) $p_k^M(\beta \mid \alpha)$ is the conditional probability that the instantaneous
description of $M$ is $\beta$ "after $k$ steps" given that $M$ "starts" with $\alpha$.


(3) $q_k^M(\beta \mid \alpha)$ is the conditional probability that $M$ will "terminate"
with $\beta$ "after $k$ steps" given that $M$ "starts" with $\alpha$.


(4) $F^M(y \mid x)$ is the conditional probability that the output of $M$ is $y$
given that input $x$ is applied.


It can be verified that $F^M$ is a random function from $U^*$ into $V^*$.


DEFINITION. Let $M = (U, S, V, W, p, h)$ be a PTM.
(1) $M$ is bounded iff for every $s, s' \in S$ and $w \in W$, $p(s', z \mid w, s) = 0$
except for finitely many $z \in W^* \cup \{+, -\}$.
(2) $M$ is synchronous iff for every $s, s' \in S$ and $w \in W$, $p(s', z \mid w, s) > 0$
implies $z \in W \cup \{+, -\}$.


The above model of PTM differs slightly from those given in (Santos, 
1969 and 1971). 

PROPOSITION 4.1. Every random function computable by a bounded PTM
is computable by a synchronous PTM.


PROPOSITION 4.2. If $F$ is a random function computable by a PTM, then
$F = F^M$ for some PTM $M = (U, S, V, W, p, h)$ satisfying the following
conditions:

(a) There exists $s_0 \in S$ such that $h(s_0) = 1$;

(b) For every $s \in S$ and $w \in W$, $p(s', z \mid w, s) > 0$ implies $z \in W^*$,
$p(s', z \mid w, s) > 0$ implies $z = +$, or $p(s', z \mid w, s) > 0$ implies $z = -$; and

(c) There exists $s_1 \in S$ such that

(i) $p(s, z \mid w, s_1) = 0$ for all $s \in S$, $w \in W$ and $z \in W^* \cup \{+, -\}$,
(ii) for every $w \in W$ and $s \in S$ where $s \ne s_1$,

\[ \sum_{s' \in S} \sum_{z \in Z} p(s', z \mid w, s) = 1, \]

where $Z = W^* \cup \{+, -\}$, and
(iii) for every $x \in U^*$, $\sum_{k \in J} q_k^M(\alpha \mid s_0 x) > 0$ implies $\alpha = s_1 y$, where
$y \in V^*$.


THEOREM 4.3. If f is generable by a (bounded) PTM, then f is a (bounded) 
type-0 random language. 


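Proof. Let $f = R(F^M)$, where $M = (U, S, V, W, p, h)$ is a (bounded)
PTM satisfying conditions (a)-(c) of Proposition 4.2. Let $A_0, A_1, ¢, \$ \notin W$
and $W_0 = W \cup \{A_0, A_1, ¢, \$\}$. Let $G = (V, N, P, h')$, where $N = W_0 - V$,
$h'(A_0) = 1$, $P = \{p_0, p_1, p_2, p_3, p_4\} \cup \{p_u : u \in U\}$, $\hat{p}_0 = \{A_0\}$, $\hat{p}_u = \{A_1\}$ for
all $u \in U$, $\hat{p}_1 = \{A_1\}$, $\hat{p}_2 = \{b\}$, $\hat{p}_3 = \{¢, \$\}$, $\hat{p}_4 = \{sw : s \in S, w \in W_0\} \cup
\{w'sw : s \in S \text{ and } w, w' \in W_0\}$, $p_0(¢ A_1 \$ \mid A_0) = 1$, $p_u(A_1 u \mid A_1) = 1$ for all
$u \in U$, $p_1(s_0 \mid A_1) = 1$, $p_2(e \mid b) = 1$, $p_3(e \mid ¢) = p_3(e \mid \$) = 1$, and

\[ p_4(\tau \mid \sigma) = \begin{cases}
p(s', z \mid w, s) & \text{if } \sigma = sw,\ \tau = s'z,\ w \ne \$ \\
p(s', + \mid w, s) & \text{if } \sigma = sw,\ \tau = ws',\ w \ne \$ \\
p(s', - \mid w, s) & \text{if } \sigma = w'sw,\ \tau = s'w'w,\ w' \ne ¢ \\
1 & \text{if } \sigma = s\$,\ \tau = sb\$ \\
  & \text{or } \sigma = ¢sw,\ \tau = ¢bsw \\
0 & \text{otherwise.}
\end{cases} \]

It is clear that $G$ is a (bounded) type-0 probabilistic grammar. Moreover,
it can be verified that $f = f_G$.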

THEOREM 4.4. If $f$ is a bounded type-0 random language, then $f$ is generable
by a synchronous PTM.


Proof. Let $f = f_G$, where $G = (T, N, P, h)$ is a bounded type-0 probabilistic
grammar. Without loss of generality, we may assume that $G$ is total
and $h(A_0) = 1$ for some $A_0 \in N$. For every $\sigma \in (T \cup N)^*$, we associate the
abstract symbol $\bar{\sigma}$. Let $Q_1 = \{\bar{\sigma} : \sigma \in \hat{P}\}$ and $Q_2 = \{\bar{\tau} : p(\tau \mid \sigma) > 0$ for some
$p \in P$ and $\sigma \in \hat{p}\}$. Let $M = (U, S, V, W, p, h')$ be a synchronous PTM,
where $U = P \cup Q_1 \cup \{1\}$, $P \cup Q_1 \cup Q_2 \cup \{1, b\} \subseteq W$ and $h'(s_0) = 1$ for some
$s_0 \in S$. We shall describe informally the behavior of $M$:


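(a) Suppose $M$ has instantaneous description $s_0 x \in S(P \cup Q_1 \cup \{1\})^*$.
Then $M$ will go to state $s_1$ if $x \notin (P Q_1^* \{1\}^*)^*$, where $p(s_1, w \mid w, s_1) = 1$ for
all $w \in W$. In other words, $M$ loops. Otherwise, $M$ will have instantaneous
description $s_2 x b b A_0$, $s_2 \in S$.


(b) Suppose $M$ has instantaneous description $s_2 z$, where $z =
p \bar{\sigma}_1 \cdots \bar{\sigma}_m k z_0 b z_1 b \alpha_1 b z_2 b \alpha_2 \cdots b z_n b \alpha_n$, $n \ge 0$, $p \in P$, $\bar{\sigma}_i \in Q_1$, $i = 1, 2, \ldots, m$,
$z_0 \in (P Q_1^* \{1\}^*)^*$, $z_i \in (P Q_1 Q_2)^*$, $\alpha_i \in (T \cup N)^*$, $i = 1, 2, \ldots, n$, and $k$ stands
for $11 \cdots 1$ ($k$ times). Then $M$ will go to state $s_1$ if (i) $n = 0$, (ii) $m < n$, or
(iii) for some $i$ and $j$ where $1 \le i, j \le n$, $\alpha_i = \alpha_j$ but $\sigma_i \ne \sigma_j$. Otherwise,
$M$ will have instantaneous description $s_3 z$, $s_3 \in S$.

(c) Suppose $M$ has instantaneous description $s_3 z$, where $z$ is the same
as in (b) satisfying none of the conditions (i)-(iii). Then $M$ will have
instantaneous description $s_2 z'$, where $z'$ is obtained from $z$ by (i) erasing $p \bar{\sigma}_1 \cdots \bar{\sigma}_m k$,
(ii) erasing $z_i b \alpha_i$ if $m(\alpha_i, \sigma_i) < k$, and (iii) replacing $z_i b \alpha_i$ by

\[ z_i p \bar{\sigma}_i \bar{\tau}_{i1} b \beta_{i1} b\, z_i p \bar{\sigma}_i \bar{\tau}_{i2} b \beta_{i2} \cdots b\, z_i p \bar{\sigma}_i \bar{\tau}_{ir} b \beta_{ir} \quad \text{if } m(\alpha_i, \sigma_i) \ge k. \]

Here, $\alpha_i \sim^k \beta_{ij} \operatorname{mod}(\sigma_i, \tau_{ij})$, $j = 1, 2, \ldots, r$, and $\{\tau_{i1}, \tau_{i2}, \ldots, \tau_{ir}\} =
\{\tau : p(\tau \mid \sigma_i) > 0\}$. Thereby, we assume that for every $p \in P$ and $\sigma \in \hat{p}$, the
set $\{\tau : p(\tau \mid \sigma) > 0\}$ is well ordered.

(d) Suppose $M$ has instantaneous description $s_2 b z$, where $z \in W^*$.
Then $M$ will go to state $s_1$ if $z = e$. Otherwise, $M$ will have instantaneous
description $s_4 z$, where $s_4 \in S$.

(e) Suppose $M$ has instantaneous description $s_4 z$, where $z =
z_1 b \alpha_1 b z_2 b \alpha_2 \cdots b z_n b \alpha_n$, $n \ge 1$, $z_i \in (P Q_1 Q_2)^*$, $\alpha_i \in (T \cup N)^*$, $i = 1, 2, \ldots, n$. If
$z_1 \ne e$ and for all $i$ where $1 \le i \le n$, $z_i$ starts with $p \bar{\sigma}$ where $p \in P$ and $\bar{\sigma} \in Q_1$,
then $M$ will have instantaneous description $s_5 z$, where $s_5 \in S$. If $z_1 = e$,
then $M$ will have instantaneous description $s_6 z'$, where $z'$ is obtained from $z$
by erasing $b$, and $s_6 \in S$. If $z = e$, then $M$ will go to state $s_1$.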

(f) Suppose $M$ has instantaneous description $s_5 z$, where $z$ is the
same as in (e) and for all $i$ where $1 \le i \le n$, $z_i$ starts with $p \bar{\sigma}$. Then, with
probability $p(\tau \mid \sigma)$, $M$ will have instantaneous description $s_4 z'$, where $z'$ is
obtained from $z$ by erasing all $z_i b \alpha_i$ where $z_i$ does not start with $p \bar{\sigma} \bar{\tau}$, and
erasing $p \bar{\sigma} \bar{\tau}$ from all $z_i$ which start with $p \bar{\sigma} \bar{\tau}$.


(g) Suppose $M$ has instantaneous description $s_6 x$, where $x \in W^*$.
Then $M$ will go to state $s_1$ if $x \notin T^*$. Otherwise, $M$ will have instantaneous
description $s_7 x$, where for every $s \in S$, $w \in W$ and $z \in W^* \cup \{+, -\}$,
$p(s, z \mid w, s_7) = 0$.


Observe that M acts probabilistically only when case (f) occurs. Otherwise, 
it acts deterministically. It can be verified that f = R(F™). 
Combining Theorems 4.3 and 4.4 yields 


THEOREM 4.5. Every bounded type-0 random language is generable by a
synchronous PTM, and vice versa.


By virtue of the above theorem, many interesting properties of bounded 
type-0 random languages can be obtained from the results given in (Santos, 
1971). 


V. CONTEXT-FREE PROBABILISTIC GRAMMARS AND PROBABILISTIC
PUSHDOWN AUTOMATA


In this section, we shall study context-free probabilistic grammars and 
their relation with probabilistic pushdown automata. Moreover, the relation 
between leftmost random languages and leftmost context-free random 
languages is also investigated. 


PROPOSITION 5.1. If $f$ is a context-free random language, then $f = f_G$
for some total context-free probabilistic grammar $G = (T, N, P, h)$ such that
$\hat{P} = N$, $h(A_0) = 1$ for some $A_0 \in N$, and for every $p \in P$ and $\sigma \in \hat{p}$, $p(\tau \mid \sigma) > 0$
implies $\tau = a\nu$, where $a \in T \cup \{e\}$ and $\nu \in N^*$. If, in addition, $f$ is bounded,
then for every $p \in P$ and $\sigma \in \hat{p}$, $p(\tau \mid \sigma) > 0$ implies $\tau = a\nu$, where $a \in T \cup \{e\}$
and $\nu \in NN$.


In what follows, if $f = f_{G,L}$ for some specified (type-0, context-sensitive,
context-free, weak-regular and regular) probabilistic grammar $G$, then we
shall say that $f$ is a leftmost random language of that specific type.

THEOREM 5.2. If $G = (T, N, P, h)$ is a context-free probabilistic grammar
such that for every $\alpha \in T^*$,

\[ f_G(\alpha) = \operatorname*{l.u.b.}_{\zeta \in Z_1^*} \sum_{A \in N} h(A)\, f_\zeta(\alpha \mid A), \]

where $Z_1 = P \times D_L(G) \times J$, then $f_G$ is leftmost context free. Indeed, $f_G = f_{G,L}$.


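Proof. By a previous remark, $D_L(G)$ contains exactly one replacement
function, say $\delta_0$. For every

\[ \zeta = \zeta_1 (p, \delta_0, k)\, \zeta_2 \in (P \times \{\delta_0\} \times J)^*, \]

where $\zeta_1 \in (P \times \{\delta_0\} \times \{1\})^*$ and $k > 1$, let $\zeta' = \zeta_1 (p, \delta_0, 1)\, \zeta_2$. It can be
verified that $f_\zeta(\alpha \mid A) \le f_{\zeta'}(\alpha \mid A)$ for all $A \in N$ and $\alpha \in T^*$. Thus, for every
$\zeta \in (P \times \{\delta_0\} \times J)^*$, there exists $\zeta' \in (P \times \{\delta_0\} \times \{1\})^*$ such that $f_\zeta(\alpha \mid A) \le
f_{\zeta'}(\alpha \mid A)$ for all $A \in N$ and $\alpha \in T^*$. Hence, $f_G = f_{G,L}$.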


THEOREM 5.3. If $G = (T, N, P, h)$ is a context-free probabilistic grammar
such that $P$ contains exactly one element, then $f_G$ is leftmost context free. Indeed,

\[ f_G = f_{G,L}. \]


Proof. Let $P = \{p\}$ and $D_L(G) = \{\delta_0\}$. It can be shown by induction on
$\lg(\zeta)$, $\zeta \in (P \times D(G) \times J)^*$, that for every $\zeta \in (\{p\} \times D(G) \times J)^*$, there
exists $\zeta' \in (\{p\} \times \{\delta_0\} \times \{1\})^*$ such that $f_\zeta(\alpha \mid A) \le f_{\zeta'}(\alpha \mid A)$ for every
$\alpha \in T^*$ and $A \in N$. Thus, $f_G = f_{G,L}$.


DEFINITION. A (total) probabilistic pushdown automaton (PPA) is
specified by a septuple $M = (U, S, V, W, p, h, g)$, where $U$, $S$, $V$ and $W$
are finite nonempty sets, $p$ is a (total) random function from $U \times W \times S$
into $S \times W^* \times V^*$, $h$ is a function from $S \times W$ into $[0, 1]$ such that
$\sum_{s \in S} \sum_{w \in W} h(s, w) = 1$, and $g$ is a function from $S$ into $[0, 1]$. If, in addition,
for every $s, s' \in S$, $u \in U$ and $w \in W$, $p(s', z, y \mid u, w, s) = 0$ except for
finitely many $z \in W^*$ and $y \in V^*$, then $M$ is bounded.

In the above definition, $U$, $S$, $V$ and $W$ are, respectively, the input, state,
output and pushdown alphabets. $p(s', z, y \mid u, w, s)$ is the conditional
probability that the next state of $M$ is $s'$, the leftmost symbol $w$ in the pushdown
list is replaced by $z$ and output string $y$ is produced, given that the present
state of $M$ is $s$, the leftmost symbol in the pushdown list is $w$ and input $u$ is
applied. $h(s, w)$ is the probability that $s$ is the initial state of $M$ and $w$ is the
initial symbol in the pushdown list. $g(s)$ is the probability that $s$ is a final
state of $M$.
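
DEFINITION. Let $M = (U, S, V, W, p, h, g)$ be a PPA.

(1) $p^M$ is the function from $(S \times W^* \times U^* \times V^*) \times (S \times W^* \times U^* \times V^*)$
into $[0, 1]$ such that for every $\alpha, \beta \in S \times W^* \times U^* \times V^*$,

\[ p^M(\beta \mid \alpha) = \begin{cases}
p(s', z_0, y_0 \mid u, w, s) & \text{if } \alpha = (s, wz, ux, y),\ \beta = (s', z_0 z, x, y y_0) \\
1 & \text{if } \alpha = (s, e, ux, y),\ \beta = (s, e, x, y) \\
  & \text{or } \alpha = \beta = (s, z, e, y) \\
0 & \text{otherwise,}
\end{cases} \]

where $s, s' \in S$, $u \in U$, $x \in U^*$, $y, y_0 \in V^*$, $w \in W$ and $z, z_0 \in W^*$.

(2) For every $k \in J \cup \{0\}$, $p_k^M$ is the function from $(S \times W^* \times U^* \times V^*) \times
(S \times W^* \times U^* \times V^*)$ into $[0, 1]$ such that for every $\alpha, \beta \in S \times W^* \times U^* \times V^*$,

\[ p_0^M(\beta \mid \alpha) = \begin{cases} 1 & \text{if } \alpha = \beta \\ 0 & \text{if } \alpha \ne \beta, \end{cases} \]

\[ p_{k+1}^M(\beta \mid \alpha) = \sum_{\gamma \in I} p_k^M(\beta \mid \gamma)\, p^M(\gamma \mid \alpha), \]

where $I = S \times W^* \times U^* \times V^*$.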


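(3) $F^M$ is the function from $V^* \times U^*$ into $[0, 1]$ such that for every
$x \in U^*$ and $y \in V^*$,

\[ F^M(y \mid x) = \sum_{s, s' \in S} \sum_{w \in W} h(s, w) \Bigl[ \operatorname*{l.u.b.}_{k \ge 0}\, p_k^M(s', e, e, y \mid s, w, x, e) \Bigr]. \]

(4) $G^M$ is the function from $V^* \times U^*$ into $[0, 1]$ such that for every
$x \in U^*$ and $y \in V^*$,

\[ G^M(y \mid x) = \sum_{s, s' \in S} \sum_{w \in W} \sum_{z \in W^*} h(s, w)\, g(s') \Bigl[ \operatorname*{l.u.b.}_{k \ge 0}\, p_k^M(s', z, e, y \mid s, w, x, e) \Bigr]. \]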


It can be verified that $F^M$ and $G^M$ are both random functions from $U^*$
into $V^*$. Moreover, for every $x \in U^*$ and $y \in V^*$ where $\lg(x) = k$,

\[ F^M(y \mid x) = \sum_{s, s' \in S} \sum_{w \in W} h(s, w)\, p_k^M(s', e, e, y \mid s, w, x, e) \]

and

\[ G^M(y \mid x) = \sum_{s, s' \in S} \sum_{w \in W} \sum_{z \in W^*} h(s, w)\, g(s')\, p_k^M(s', z, e, y \mid s, w, x, e). \]
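
The closed form for acceptance with empty store suggests a direct simulation: run the one-step transition exactly as many times as the input is long, then collect the mass on configurations with empty pushdown list and empty remaining input. The sketch below is illustrative only; configurations are tuples `(state, stack, remaining_input, output)` of strings with the leftmost stack symbol first, and the transition table is keyed by `(next_state, pushed, emitted, input_symbol, top, state)`. All names are hypothetical.

```python
from fractions import Fraction

def step(dist, p):
    """Apply one step of the configuration transition to a distribution."""
    nxt = {}
    def add(cfg, mass):
        nxt[cfg] = nxt.get(cfg, Fraction(0)) + mass
    for (s, stack, x, y), mass in dist.items():
        if not x:                       # input exhausted: configuration is absorbing
            add((s, stack, x, y), mass)
        elif not stack:                 # empty store: consume input without acting
            add((s, "", x[1:], y), mass)
        else:
            u, w = x[0], stack[0]
            for (s2, z0, y0, u2, w2, s1), prob in p.items():
                if (u2, w2, s1) == (u, w, s):
                    add((s2, z0 + stack[1:], x[1:], y + y0), mass * prob)
    return nxt

def F_empty_store(p, h, x, y):
    """Probability of output y on input x, ending with an empty pushdown list."""
    dist = {(s, w, x, ""): prob for (s, w), prob in h.items()}
    for _ in range(len(x)):
        dist = step(dist, p)
    return sum(mass for (s, stack, rest, out), mass in dist.items()
               if stack == "" and rest == "" and out == y)

# Hypothetical single-state PPA: on "u" with top "A", pop and emit "a", or keep "A" and emit "b".
p = {
    ("s", "", "a", "u", "A", "s"): Fraction(1, 2),
    ("s", "A", "b", "u", "A", "s"): Fraction(1, 2),
}
h = {("s", "A"): Fraction(1)}
```

On input `"uu"` the output `"a"` has probability 1/2 (pop on the first step, idle on the second) and `"ba"` has probability 1/4, while `"bb"` leaves a symbol on the stack and contributes nothing.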


It is clear from the above definition that PPA are stochastic generalizations 
of conventional pushdown automata. $F^M$ is the random function computed
by $M$ with empty store, while $G^M$ is the random function computed by $M$
with final states.


THEOREM 5.4. If $f$ is a leftmost (bounded) context-free random language,
then $f$ is generable with empty store by some total (bounded) PPA having a
single state and $p(s', z, y \mid u, w, s) > 0$ implies $y \in T \cup \{e\}$.


Proof. Let $f = f_{G,L}$, where $G = (T, N, P, h)$ is a (bounded) context-free
probabilistic grammar. By Proposition 5.1, we may assume that $G$ is total
and for every $p \in P$ and $\sigma \in \hat{p}$, $p(\tau \mid \sigma) > 0$ implies $\tau = a\nu$, where $a \in T \cup \{e\}$
and $\nu \in N^*$. Let $M = (P, \{s_0\}, T, N, p, h', g)$, where $h'(s_0, A) = h(A)$ and
$p(s_0, z, y \mid p, w, s_0) = p(yz \mid w)$ for all $z \in N^*$ and $y \in T^*$, $p \in P$ and $w \in N$.
It is clear that $M$ is a PPA with the desired properties. Moreover, it can be
verified that $f = R(F^M)$.
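
The construction in this proof amounts to splitting each right-hand side into its terminal prefix, which becomes the emitted output, and its nonterminal suffix, which is pushed in place of the rewritten nonterminal. A minimal sketch follows, assuming productions are named entries of a dictionary, symbols are single characters, and every right-hand side already has the normal form (terminals followed by nonterminals); `grammar_to_ppa` and the state name `"s0"` are illustrative.

```python
from fractions import Fraction

def grammar_to_ppa(P, h, T, state="s0"):
    """Return (transitions, initial) for a single-state PPA built from productions P.

    Transitions are keyed by (next_state, pushed, emitted, production_name, top, state).
    """
    trans = {}
    for name, prod in P.items():
        for (tau, w), prob in prod.items():
            i = 0
            while i < len(tau) and tau[i] in T:   # terminal prefix y of tau = y z
                i += 1
            y, z = tau[:i], tau[i:]
            trans[(state, z, y, name, w, state)] = prob
    initial = {(state, A): prob for A, prob in h.items()}
    return trans, initial

# Hypothetical grammar: A -> aAA with probability 1/3, A -> b with probability 2/3.
P = {"p": {("aAA", "A"): Fraction(1, 3), ("b", "A"): Fraction(2, 3)}}
trans, initial = grammar_to_ppa(P, {"A": Fraction(1)}, T={"a", "b"})
```

The input alphabet of the resulting automaton is the set of production names, so an input string plays the role of a leftmost derivation.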


THEOREM 5.5. If $f$ is a leftmost (bounded) context-free random language,
then $f$ is generable with final states by some total (bounded) PPA having two
states and $p(s', z, y \mid u, w, s) > 0$ implies $y \in T \cup \{e\}$.


Proof. Let f = f_{G,l}, where G = (T, N, P, h) is a (bounded) context-free
probabilistic grammar. By Proposition 5.1, we may assume that G is total
and for every ρ ∈ P and σ ∈ ρ̄, ρ(τ | σ) > 0 implies τ = av, where a ∈ T ∪ {e}
and v ∈ N*. Moreover, we may assume, without loss of generality, that there
exist unique A_0, A_1, A_2 ∈ N and ρ_0, ρ_1 ∈ P such that (i) h(A_0) = 1,
(ii) ρ̄_0 = {A_0} and ρ_0(A_2 A_1 | A_0) = 1, (iii) ρ̄_1 = {A_1} and ρ_1(e | A_1) = 1,
(iv) for i = 0, 1, A_i ∈ ρ̄ implies ρ = ρ_i, (v) for every ρ ∈ P and σ ∈ ρ̄,
ρ(τ | σ) > 0 implies τ ∉ (T ∪ N)* {A_0} (T ∪ N)*, and (vi) for every ρ ∈ P,
σ ∈ ρ̄ and ρ ≠ ρ_0, ρ(τ | σ) > 0 implies τ ∉ (T ∪ N)* {A_1} (T ∪ N)*. In other
words, we modify G in such a way that A_1 serves as an endmarker. Let
M = (P, S, T, N, p, h', g) where S = {s_1, s_2}, h'(s_1, A_0) = 1, g(s_2) = 1 and
for every s, s' ∈ S, ρ ∈ P, w ∈ N, z ∈ N* and y ∈ T*,


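$$
p(s', z, y \mid \rho, w, s) =
\begin{cases}
\rho(yz \mid w) & \text{if } s = s' = s_1,\ w \neq A_1 \text{ or } \rho \neq \rho_1, \\
1 & \text{if } s = s_1,\ s' = s_2,\ w = A_1,\ \rho = \rho_1,\ z = y = e, \\
  & \text{or } s = s' = s_2,\ w = z,\ y = e, \\
0 & \text{otherwise.}
\end{cases}
$$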


Clearly, M is a PPA with the desired properties. Moreover, it can be verified
that f = R(G^M).


Notation. Let C be a nonempty set and k ∈ J. C^(k) = {a ∈ C*: lg(a) < k}.


THEOREM 5.6. If f = R(F^M) for some bounded PPA M, then f is a leftmost-
bounded context-free random language.


Proof. Let M = (U, S, V, W, p, h, g). Since M is bounded, there exists
k ∈ J such that for every s, s' ∈ S, u ∈ U, w ∈ W, y ∈ V* and z ∈ W*,
p(s', z, y | u, w, s) > 0 implies lg(z) < k. Let N_0 = {(s, w, s'): s, s' ∈ S and
w ∈ W} and N = N_0 ∪ {A_0}, where A_0 ∉ N_0. For simplicity, we shall write
(s_1 s_2 ··· s_n, w_1 w_2 ··· w_n, s_1' s_2' ··· s_n') for (s_1, w_1, s_1')(s_2, w_2, s_2') ··· (s_n, w_n, s_n'),
where (s_i, w_i, s_i') ∈ N_0, i = 1, 2, ..., n. Let Q be the collection of all functions
q from W* into S^(k) such that q(z) = r implies lg(r) = lg(z) − 1. For every
u ∈ U and q ∈ Q, let ρ_{u,q} be the probabilistic production over V ∪ N, where
ρ̄_{u,q} = N_0 and for every A ∈ N_0, τ ∈ (V ∪ N)*,


$$
\rho_{u,q}(\tau \mid A) =
\begin{cases}
p(s'', z, y \mid u, w, s) & \text{if } A = (s, w, s') \text{ and } \tau = y\,(s'' q(z),\, z,\, q(z) s'), \\
0 & \text{otherwise.}
\end{cases}
$$


Moreover, for every s' ∈ S, let ρ_{s'} be the probabilistic production over V ∪ N
where ρ̄_{s'} = {A_0} and

$$
\rho_{s'}(\tau \mid A_0) =
\begin{cases}
h(s, w) & \text{if } \tau = (s, w, s'), \\
0 & \text{otherwise.}
\end{cases}
$$


Let G = (V, N, P, h'), where h'(A_0) = 1 and P = {ρ_{u,q}: u ∈ U, q ∈ Q} ∪
{ρ_{s'}: s' ∈ S}. Clearly, G is a bounded context-free probabilistic grammar.
Moreover, it can be verified that f = f_{G,l}.


THEOREM 5.7. If f = R(G^M) for some bounded PPA M, then f is a leftmost-
bounded context-free random language.


Proof. By virtue of Theorem 5.6, it suffices to show that f = R(F^{M_0})
for some bounded PPA M_0. Let M = (U, S, V, W, p, h, g) and
M_0 = (U_0, S_0, V, W_0, p_0, h_0, g), where U_0 = U ∪ {u_0, u_1}, u_0, u_1 ∉ U,
S_0 = S ∪ {s_0, s_1}, s_0, s_1 ∉ S, W_0 = W ∪ {w_0, $}, w_0, $ ∉ W, h_0(s_0, w_0) = 1,
and for every s, s' ∈ S_0, u ∈ U_0, w ∈ W_0, y ∈ V*, z ∈ W_0*,


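$$
p_0(s', z, y \mid u, w, s) =
\begin{cases}
h(s', w') & \text{if } s = s_0,\ w = w_0,\ u = u_0,\ s' \in S,\ z = w'\$,\ y = e, \\
p(s', z, y \mid u, w, s) & \text{if } s, s' \in S,\ w \in W,\ u \in U,\ y \in V^*,\ z \in W^*, \\
g(s) & \text{if } s \in S,\ s' = s_1,\ w \in W_0,\ u = u_1,\ z = y = e, \\
1 & \text{if } s = s' = s_1,\ w \in W_0,\ u = u_1,\ z = y = e, \\
0 & \text{otherwise.}
\end{cases}
$$
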
Clearly, M_0 is a bounded PPA. Moreover, it can be verified that
f = R(F^{M_0}).

Combining Theorems 5.4, 5.5, 5.6 and 5.7 yields


THEOREM 5.8. The following statements are equivalent:

(1) f is a leftmost-bounded context-free random language.

(2) f = R(F^M) for some bounded PPA M.

(3) f = R(G^M) for some bounded PPA M.

(4) f = R(F^M) for some total bounded PPA M, where p(s', z, y | u, w, s) > 0
implies y ∈ V ∪ {e}.

(5) f = R(G^M) for some total bounded PPA M, where p(s', z, y | u, w, s) > 0
implies y ∈ V ∪ {e}.

It is well known (Matthews, 1964) that languages generated by any
grammar using only leftmost derivations are context free. A similar result
will be shown below for random languages.


THEOREM 5.9. If f = f_{G,l} for some bounded probabilistic grammar
G = (T, N, P, h) such that P = {ρ}, then f is leftmost-bounded context free.


Proof. For every σ ∈ ρ̄, let A_i^σ denote the i-th symbol of σ. Moreover,
with every σ ∈ ρ̄ we shall associate the symbols u_i^σ, s_i^σ, i = 1, 2, ..., lg(σ) − 1.
Let M = (U, S, T, W, p, h', g), where U = {ρ} ∪ {u_i^σ: σ ∈ ρ̄ and i = 1, 2, ...,
lg(σ) − 1}, S = {s_0} ∪ {s_i^σ: σ ∈ ρ̄ and i = 1, 2, ..., lg(σ) − 1}, W = T ∪ N,
and for every s, s' ∈ S, u ∈ U, w ∈ W, y ∈ T* and z ∈ W*,


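$$
p(s', z, y \mid u, w, s) =
\begin{cases}
\rho(\tau \mid \sigma) & \text{if } s = s_{k-1}^{\sigma},\ w = A_k^{\sigma},\ u = \rho,\ s' = s_0,\ z = \tau,\ y = e,\ k = \operatorname{lg}(\sigma), \\
1 & \text{if } (u, w, s) = (u_k^{\sigma}, A_k^{\sigma}, s_{k-1}^{\sigma}),\ s' = s_k^{\sigma},\ z = y = e,\ k < \operatorname{lg}(\sigma), \\
  & \text{or } s = s' = s_0,\ w \in T \text{ and } w \neq A_1^{\sigma} \text{ for any } \sigma \in \bar{\rho},\ z = e,\ y = w, \\
  & \text{or } (u, w, s) \text{ is not equal to any of those given above},\ s' = s,\ z = w,\ y = e, \\
0 & \text{otherwise,}
\end{cases}
$$

$$
h'(s, w) =
\begin{cases}
h(w) & \text{if } s = s_0,\ w \in N, \\
0 & \text{otherwise.}
\end{cases}
$$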


Clearly, M is a bounded PPA. Moreover, it can be verified that f = R(F^M).
Thus, by Theorem 5.6, f is leftmost-bounded context free.


RECEIVED: July 16, 1971; REVISED: April 7, 1972